Exploratory HJB Equations and Their Convergence

Abstract

We study the exploratory Hamilton–Jacobi–Bellman (HJB) equation arising from the entropy-regularized exploratory control problem, which was formulated by Wang, Zariphopoulou, and Zhou (J. Mach. Learn. Res., 21 (2020), 198) in the context of reinforcement learning in continuous time and space. We establish the well-posedness and regularity of the viscosity solution to the equation, as well as the convergence of the exploratory control problem to the classical stochastic control problem when the level of exploration decays to zero. We then apply the general results to the exploratory temperature control problem, which was introduced by Gao, Xu, and Zhou (SIAM J. Control Optim., 60 (2022), pp. 1250–1268) to design an endogenous temperature schedule for simulated annealing in the context of nonconvex optimization. We derive an explicit rate of convergence for this problem as exploration diminishes to zero, and find that the stationary distribution of the optimally controlled process exists, which is however neither a Dirac mass on the global optimum nor a Gibbs measure.

Similar resources

Nonlinear HJB Equations

This paper is concerned with the standard finite element approximation of Hamilton–Jacobi–Bellman (HJB) equations with nonlinear source terms. Under a realistic condition on the nonlinearity, we characterize the discrete solution as a fixed point of a contraction. As a result, we also derive a sharp L∞ error estimate of the approximation. Mathematics Subject Classification: Primary 35F21;...

HJB Equations for Certain Singularly Controlled Diffusions

over the admissible controls U. Both g and κ · u (u ∈ U) may take positive and negative values. This paper studies the corresponding dynamic programming equation (DPE), a second-order degenerate elliptic partial differential equation of HJB type with a state-constraint boundary condition. Under the controllability condition GU = R and the finiteness of H(q) = sup_{u ∈ U_1} {−Gu · q − κ · u}, q ∈ R, w...

A New Scheme for Discrete HJB Equations

In this paper we propose a relaxation scheme for solving discrete HJB equations, based on scheme II [1] of Lions and Mercier. We establish the convergence of the new scheme, and a numerical example shows that it is efficient.

Asymptotic Analysis of Forward Performance Processes in Incomplete Markets and Their Ill-Posed HJB Equations

We consider the problem of optimal portfolio selection under forward investment performance criteria in an incomplete market. The dynamics of the prices of the traded assets depend on a pair of stochastic factors, namely, a slow factor (e.g. a macroeconomic indicator) and a fast factor (e.g. stochastic volatility). We analyze the associated forward performance SPDE and provide explicit formulae...

Pathwise Stochastic Control Problems and Stochastic HJB Equations

In this paper we study a class of pathwise stochastic control problems in which the optimality is allowed to depend on the paths of exogenous noise (or information). Such a phenomenon can be illustrated by considering a particular investor who wants to take advantage of certain extra information but in a completely legal manner. We show that such a control problem may not even have a “minimizin...

Journal

Journal title: SIAM Journal on Control and Optimization

Year: 2022

ISSN: 0363-0129, 1095-7138

DOI: https://doi.org/10.1137/21m1448185